Gene ranking and biomarker discovery under correlation
Abstract
Motivation: Biomarker discovery and gene ranking are standard tasks in genomic high-throughput analysis. Typically, the ordering of markers is based on a stabilized variant of the t-score, such as the moderated t or the SAM statistic. However, these procedures ignore gene-gene correlations, which may have a profound impact on the gene orderings and on the power of the subsequent tests.

Results: We propose a simple procedure that adjusts gene-wise t-statistics to take account of correlations among genes. The resulting correlation-adjusted t-scores ('cat' scores) are derived from a predictive perspective, i.e. as a score for variable selection to discriminate group membership in two-class linear discriminant analysis. In the absence of correlation the cat score reduces to the standard t-score. Moreover, using the cat score it is straightforward to evaluate groups of features (i.e. gene sets). For computation of the cat score from small sample data, we propose a shrinkage procedure. In a comparative study comprising six different synthetic and empirical correlation structures, we show that the cat score improves estimation of gene orderings and leads to higher power for fixed true discovery rate, and vice versa. Finally, we also illustrate the cat score by analyzing metabolomic data.

Availability: The shrinkage cat score is implemented in the R package 'st', which is freely available under the terms of the GNU General Public License (version 3 or later) from CRAN (http://cran.r-project.org/web/packages/st/).
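The idea of a correlation-adjusted t-score can be illustrated with a minimal sketch. The abstract does not state the formula, so the code below assumes the decorrelation form cat = P^(-1/2) t, where t is the vector of gene-wise t-scores and P is the pooled within-class correlation matrix among genes. It uses plain empirical estimates rather than the shrinkage procedure mentioned above, so it is only usable when there are more samples than genes. All function names (`t_scores`, `inverse_sqrt`, `cat_from_t`, `cat_scores`) are hypothetical and are not part of the 'st' package.

```python
import numpy as np


def t_scores(X, y):
    """Equal-variance two-sample t-scores, one per column (gene) of X.

    X is an (n_samples, n_genes) array; y holds class labels 0 and 1.
    """
    X1, X2 = X[y == 0], X[y == 1]
    n1, n2 = len(X1), len(X2)
    # Pooled within-class variance per gene.
    pooled_var = (
        (n1 - 1) * X1.var(axis=0, ddof=1) + (n2 - 1) * X2.var(axis=0, ddof=1)
    ) / (n1 + n2 - 2)
    std_err = np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (X1.mean(axis=0) - X2.mean(axis=0)) / std_err


def inverse_sqrt(P):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(P)
    # V diag(1 / sqrt(lambda)) V^T
    return (eigvecs / np.sqrt(eigvals)) @ eigvecs.T


def cat_from_t(t, P):
    """Assumed cat-score form: decorrelate the t-scores with P^(-1/2)."""
    return inverse_sqrt(P) @ t


def cat_scores(X, y):
    """Empirical (non-shrinkage) cat scores; requires n_samples > n_genes."""
    Xc = X.astype(float).copy()
    # Center each class separately so the correlation is within-class.
    for k in (0, 1):
        Xc[y == k] -= Xc[y == k].mean(axis=0)
    P = np.corrcoef(Xc, rowvar=False)
    return cat_from_t(t_scores(X, y), P)
```

With P equal to the identity matrix, `cat_from_t` returns the t-scores unchanged, which matches the statement that the cat score reduces to the standard t-score in the absence of correlation. For realistic data with many more genes than samples, the empirical correlation matrix is singular, which is why a shrinkage estimator is needed in practice.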
Related resources
Proteomics Applications in Health: Biomarker and Drug Discovery and Food Industry
Advances in genome sequencing have greatly propelled the understanding of the living world; however, they are insufficient for a full description of a biological system. In this context, proteomics has emerged as another large-scale platform for improving the understanding of biology. Proteomic experiments can be used for different aspects of clinical and health sciences such as food technology, biomark...
Seed-weighted random walk ranking for cancer biomarker prioritisation: a case study in leukaemia
A central focus of clinical proteomics for cancer is to identify protein biomarkers with diagnostic and therapeutic application potential. Network-based analyses have been used in computational disease-related gene prioritisation for several years. The Random Walk Ranking (RWR) algorithm has been successfully applied to prioritising disease-related gene candidates by exploiting global network t...
Pharmaceutical Advances and Proteomics Researches
Proteomics enables understanding of the composition, structure, function and interactions of the entire protein complement of a cell, a tissue, or an organism under exactly defined conditions. Factors such as stress or drug effects change the protein pattern, causing the presence or absence of a protein or gradual variation in abundance. Changes in the proteome provide a snapshot of the...
Journal:
- Bioinformatics
Volume 25, Issue 20
Pages: -
Publication date: 2009